Skip to content

Conversation

@prdoyle
Copy link
Contributor

@prdoyle prdoyle commented May 22, 2025

Background

I saw this error on a build on commit 4faef72:

REPRODUCE WITH: gradlew ":server:test" --tests "org.elasticsearch.index.snapshots.blobstore.SlicedInputStreamTests.testRandomMarkReset" -Dtests.seed=57395327998DD7F4 -Dtests.locale=ast-Latn-ES -Dtests.timezone=Asia/Yekaterinburg -Druntime.java=24
SlicedInputStreamTests > testRandomMarkReset FAILED
    java.lang.AssertionError: Reset at mark should not re-open slices
        at __randomizedtesting.SeedInfo.seed([57395327998DD7F4:57FCBCBA4A77F350]:0)
        at org.elasticsearch.index.snapshots.blobstore.SlicedInputStreamTests.testRandomMarkReset(SlicedInputStreamTests.java:181)

Hypothesis

In the case that the mark is at the very end of the stream, the slice's stream could already be closed by the time we call reset(), and so the stream must be reopened.

Test fix

Change the test so the above assertion passes whenever the mark is right at the end of the input.

In the case that the mark is at the very end of the stream,
the slice's stream could already be closed by the time we
call reset(), and so the stream must be reopened.
@prdoyle prdoyle requested a review from kingherc May 22, 2025 20:41
@prdoyle prdoyle self-assigned this May 22, 2025
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.1.0 labels May 22, 2025
@prdoyle prdoyle added >test Issues or PRs that are addressing/adding tests :Core/Infra/Core Core issues without another label auto-backport Automatically create backport pull requests when merged v8.19.0 and removed needs:triage Requires assignment of a team area label labels May 22, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label May 22, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@prdoyle prdoyle changed the title Allow for mark == bytes.length Allow for mark == bytes.length in SlicedInputStreamTests May 22, 2025
@ldematte ldematte added the :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs label May 23, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label May 23, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch

@prdoyle prdoyle merged commit 940bea7 into elastic:main May 26, 2025
18 checks passed
prdoyle added a commit to prdoyle/elasticsearch that referenced this pull request May 26, 2025
In the case that the mark is at the very end of the stream,
the slice's stream could already be closed by the time we
call reset(), and so the stream must be reopened.
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.19

@prdoyle
Copy link
Contributor Author

prdoyle commented May 26, 2025

@kingherc - I'm not totally confident I've identified the root cause, or that my fix covers all cases. For example, it doesn't cover the case that the mark is at the end of a slice other than the last slice. But at least the particular failure I observed won't happen anymore.

elasticsearchmachine pushed a commit that referenced this pull request May 26, 2025
In the case that the mark is at the very end of the stream,
the slice's stream could already be closed by the time we
call reset(), and so the stream must be reopened.
@kingherc
Copy link
Contributor

kingherc commented Jun 5, 2025

Great catch @prdoyle !

For example, it doesn't cover the case that the mark is at the end of a slice other than the last slice.

Just run the test with mark = end, and indeed the test was failing due to that. Debugging it, it seems that currentSliceOffset is 0 at the end of the stream, because there's no next slice to read from, so it's 0.

I don't believe it can happen at the end of an intermediate slice, as if a stream is ending, it's because there's a read of a byte, and that means we'll keep opening streams until we read at least a byte (thus making currentSliceOffset non-zero) or reaching the end of the bytes.

So indeed it seems you caught the corner case, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged :Core/Infra/Core Core issues without another label :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Core/Infra Meta label for core/infra team Team:Distributed Coordination Meta label for Distributed Coordination team >test Issues or PRs that are addressing/adding tests v8.19.0 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants